Documentation for JP1_XTAB.PRG
Purpose: Extend crosstab() with three operations, AVG, STD and VAR,
and add a summary row and column.
Author: Original crosstab code by Bill Ramos and Kirk Nason,
copyright by Ashton-Tate.
Modified by Jay Parsons, CIS 70160,340, AT-BBS Jparsons
Modifications are copyrighted; see .PRG file for license.
Version: 1.0 <grin>.
Date: September 23, 1990.
*-----------------------------------------------------------------------------
Special Note
If you don't have the Control Center Booster, you will NOT be able to
use the attached file JP1_XTAB.PRG. If you do have the booster, BACK IT UP.
Then delete from it all the code for these three items:
FUNCTION crosstab()
PROCEDURE XT_DisGets
PROCEDURE Com_Xtab
Append the JP1_XTAB.PRG file to the truncated CCBOOSTR.PRG file. Then
start dBASE, compile and activate the amended file with "SET PROC TO CCBOOSTR"
(or, if you have prudently given the amended file another name, SET PROC TO
<its name>), and type ASSIST to use it. Activate CAMPING.DBF or another file,
move to the Query column and choose CREATE, and type "crosstab()" into any
column but the first "filename pothandle."
Like the original code, this code leaves you with files selected and
open other than the ones originally selected. The 8-record file CAMPING.DBF
is included for testing, etc.
*----------------------------------------------------------------------------
Description of modifications
The modifications are of two distinct sorts; the code is roughly commented
to indicate which parts to delete if you want to remove one or the other of
them.
The first modification supports the functions AVG, STD and VAR,
representing the average, population standard deviation and population variance
of the items tabulated, in addition to the original SUM, CNT, MAX and MIN. I am
not a statistician and do not know whether the sample deviation and variance
would be more meaningful measures of dispersion of the cross-tabulated groups;
if anyone knows that to be the case, please notify me. In any event, switching
to the sample statistics requires simply subtracting 1 from the denominator of
the formula used to calculate the variance (the final denominator, not the
internal division of the square of the sum of the values).
The formula used for variance is:
( Sum of squares of items less
( square of sum of items / number of items ) )
all divided by number of items. && should be (number of items - 1) for sample statistics.
And, of course, std = sqrt( var ).
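For readers who want to check the arithmetic, here is a minimal sketch of the
formula above. It is written in Python rather than dBASE and is not taken from
JP1_XTAB.PRG. The function names, the test values, and the guard that returns
0 when the sample denominator would be zero are all assumptions made for
illustration.

```python
import math

def variance(n, total, sum_sq, sample=False):
    """Variance from running totals: item count, sum, and sum of squares."""
    if n == 0:
        return 0.0  # the documentation reports 0 for an empty cell
    denom = n - 1 if sample else n  # subtract 1 only from the final denominator
    if denom == 0:
        return 0.0  # assumption: a single item is reported as 0, not an error
    return (sum_sq - total * total / n) / denom

def std(n, total, sum_sq, sample=False):
    # max() guards against a tiny negative result from rounding
    return math.sqrt(max(variance(n, total, sum_sq, sample), 0.0))

# Illustrative values only; they do not come from CAMPING.DBF.
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(values)
total = sum(values)
sum_sq = sum(v * v for v in values)

pop_var = variance(n, total, sum_sq)
pop_std = std(n, total, sum_sq)
samp_var = variance(n, total, sum_sq, sample=True)
```

Only the count, the sum and the sum of squares are needed, so the statistic
can be computed from accumulated totals without rereading the items.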
For simplicity, the extended functions report a 0 value where there
are no members of the detail crosstabulated group ("cell"). That is, in the
enclosed Camping file Jane does not buy a Stove, so there is nothing in the
cell where the row "Jane" intersects the column "Stove". None of the statistics
AVG, STD and VAR is meaningful there, so a 0 is output to the Crosstab.dbf file.
This creates a problem. In the Camping file, Jane buys only one Tent.
Since there is no variance or standard deviation among one item, a zero also
appears here in the Crosstab file if these operations are chosen. However,
since Jane has bought three items in all, the summary column properly reflects
the standard deviation or variance among the prices of all three items. This
looks incorrect, because the summary column will contain a figure quite
unrelated to the figures visible in the detail cells. I know of no way to
remove this peculiarity other than to copy all the values to a file of all-
character fields, replace the zeroes with "N/A" where there are no members,
and perhaps also include the number of members in each group with it. This
has NOT been done in the interest of preserving the original structure of
one displayed number per cell. Removing the support for summary rows and
columns will make the output look less goofy, but will not eliminate the
possible confusion between a "0" meaning one item and no dispersion and a "0"
meaning no data.
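The peculiarity can be reproduced with a small Python sketch (again an
illustration, not code from the .PRG file). The item names other than Tent and
Stove, and all of the prices, are invented; the real contents of CAMPING.DBF
are not reproduced here. Keeping a count beside each figure, as suggested
above, is what tells the two kinds of zero apart.

```python
import math
from collections import defaultdict

def pop_std(values):
    """Population standard deviation; 0.0 for an empty group, as described above."""
    n = len(values)
    if n == 0:
        return 0.0
    total = sum(values)
    sum_sq = sum(v * v for v in values)
    return math.sqrt(max((sum_sq - total * total / n) / n, 0.0))

# Hypothetical purchases by Jane: (item, price). One Tent, no Stove.
purchases = [("Tent", 120.0), ("Lantern", 30.0), ("Boots", 60.0)]
columns = ["Boots", "Lantern", "Stove", "Tent"]

cells = defaultdict(list)
for item, price in purchases:
    cells[item].append(price)

row = {col: pop_std(cells[col]) for col in columns}      # detail cells
counts = {col: len(cells[col]) for col in columns}       # members per cell
summary = pop_std([price for _, price in purchases])     # all of Jane's items

for col in columns:
    print(col, row[col], "n =", counts[col])
print("Summary", round(summary, 2), "n =", len(purchases))
```

Every detail cell prints 0.0, yet the summary is about 37.42. Only the counts
show that the Stove zero means "no data" while the Tent zero means "one item,
no dispersion".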
To support the additional operations, up to two files, N_XTAB.DBF and
SQ_XTAB.DBF, are created (or will be overwritten) for operations that need
them. STD and VAR need both, AVG only N_XTAB. These files hold the numbers
(N_XTAB) of members in a cell or the sum of the squares of the members
(SQ_XTAB), both figures being needed for the statistical calculations. Also
created are one or two arrays with as many elements as the number of columns
plus two, which may push memory to the limit (try reducing dBheap, particularly
if you have a cache.) These are used to hold the "downward" totals of the
various columns of the other files.
As indicated above, the figures in the "Summary" row and column are
calculated using all members of the row or column, ignoring cells with no
members; they are not simply the sum, average, etc. of the figures in the
cells.
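The difference between a summary computed from all members and one computed
from the cell figures can be sketched as follows. This is an illustrative
Python sketch, not the dBASE code. The dictionaries sums and counts merely
stand in for the roles of Crosstab.dbf and N_XTAB.DBF, and the data values are
invented.

```python
from collections import defaultdict

# Hypothetical members of one crosstab row: (column name, value).
row_members = [("Tent", 100.0), ("Tent", 200.0), ("Tent", 300.0), ("Stove", 40.0)]
columns = ["Lantern", "Stove", "Tent"]  # Lantern has no members in this row

# One pass over the data accumulates a sum and a count per cell.
sums = defaultdict(float)
counts = defaultdict(int)
for column, value in row_members:
    sums[column] += value
    counts[column] += 1

def average(total, n):
    return total / n if n else 0.0  # empty cells are reported as 0

cell_avg = {c: average(sums[c], counts[c]) for c in columns}

# Summary as described above: pool all members, so empty cells add nothing.
row_n = sum(counts[c] for c in columns)
row_sum = sum(sums[c] for c in columns)
summary_avg = average(row_sum, row_n)

# For contrast: the average of the figures shown in the cells.
avg_of_cells = sum(cell_avg.values()) / len(columns)
```

Here summary_avg is 160.0 while avg_of_cells is 80.0, which is why the summary
cannot be derived from the displayed cell figures alone.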
The second modification is the addition of a row and a column to the
crosstab table to hold the summaries. This modification is theoretically
independent of the support for additional operations and can be removed from
the code independently, although the effort to do everything in a single pass
causes commingling in the source code of the modifications. To comply with
the convention in the original code of arranging rows and columns in order
of the sort order of their fieldnames, the summary row and column are given
working names of "Z_____", the last possible fieldname, which is replaced at
display time with "Summary". The need to change the name without changing the
display order requires the addition of one additional field to Crosstab.dbf
holding the record number in display order.
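The naming trick can be illustrated with a short Python sketch. The field
names other than "Z_____" are hypothetical, and Python's default string
ordering is only assumed to agree with the dBASE sort order for these
upper-case names.

```python
SUMMARY_WORKING_NAME = "Z_____"
DISPLAY_NAME = "Summary"

# Hypothetical column field names plus the working name of the summary column.
field_names = ["TENT", SUMMARY_WORKING_NAME, "LANTERN", "STOVE"]

# Sorting on the working name keeps the summary last.
ordered = sorted(field_names)

# Record the display position first, then swap in the display name.
display = [
    (position, DISPLAY_NAME if name == SUMMARY_WORKING_NAME else name)
    for position, name in enumerate(ordered, start=1)
]
labels = [label for _, label in display]

# For contrast: renaming before sorting moves the summary out of last place.
renamed_first = sorted(
    DISPLAY_NAME if name == SUMMARY_WORKING_NAME else name for name in field_names
)
```

Because the stored position no longer depends on the name, the label can be
changed at display time without disturbing the order.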
As implemented, the "Summary" row or column gives the same statistic
as the other cells, applied to the members of its row or column or, in case of
the lowest and rightmost cell, to the whole file. That is, if MAX is chosen as
the operation, the maximum of each row will appear in the summary column to the
right, the maximum of each column at its foot, and the maximum of the values in
the entire file at lowest right. Similarly, STD will report the standard
deviations of the members of the cells in its row, column or file. Anyone
who feels this behavior is not appropriate is invited to call me in the
interest of developing a program that does what it should.
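As an illustration of the behavior described for MAX, the following Python
sketch builds a small crosstab with a summary row and column. The buyer "Bob"
and all prices are invented. For clarity the sketch re-filters the records for
every cell instead of working in a single pass as the .PRG file attempts to.

```python
# Hypothetical records: (row name, column name, value).
records = [
    ("Jane", "Tent", 120.0),
    ("Jane", "Lantern", 30.0),
    ("Bob", "Stove", 45.0),
    ("Bob", "Tent", 150.0),
]

SUMMARY = "Summary"
rows = sorted({r for r, _, _ in records})
cols = sorted({c for _, c, _ in records})

def op_max(values):
    return max(values) if values else 0.0  # empty cells are reported as 0

table = {}
for r in rows + [SUMMARY]:
    for c in cols + [SUMMARY]:
        # A summary row or column matches every member along that axis.
        members = [
            v for rr, cc, v in records
            if (r == SUMMARY or rr == r) and (c == SUMMARY or cc == c)
        ]
        table[(r, c)] = op_max(members)

for r in rows + [SUMMARY]:
    print(r, [table[(r, c)] for c in cols + [SUMMARY]])
```

The right-hand column holds each row's maximum, the bottom row each column's
maximum, and the lowest right cell the maximum of the whole data set.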
Jay Parsons